Backward induction

Backward induction is the process of reasoning backwards in time, from the end of a problem or situation, to determine a sequence of optimal actions. It proceeds by first considering the last time a decision might be made and choosing what to do in any situation at that time. Using this information, one can then determine what to do at the second-to-last time of decision. This process continues backwards until one has determined the best action for every possible situation (i.e. for every possible information set) at every point in time.

In the mathematical optimization method of dynamic programming, backward induction is one of the main methods for solving the Bellman equation.^[1]^[2] In game theory, backward induction is a method used to compute subgame perfect equilibria in sequential games.^[3] The only difference is that optimization involves just one decision maker, who chooses what do at each point of time, whereas game theory analyzes how the decisions of several players interact. That is, by anticipating what the last player will do in each situation, it is possible to determine what the second-to-last player will do, and so on. In the related fields of automated planning and scheduling and automated theorem proving, the method is called backward search or backward chaining. In chess it is called retrograde analysis.

Backward induction has been used to solve games as long as the field of game theory has existed. John von Neumann and Oskar Morgenstern suggested solving zero-sum, two-person games by backward induction in their Theory of Games and Economic Behavior (1944), the book which established game theory as a field of study.^[4]^[5]

1 An example of decision-making by backward induction
2 An example of backward induction in game theory
3 Backward induction and economic entry
4 A paradox of backward induction
5 Notes

An example of decision-making by backward induction

Consider an unemployed person who will be able to work for ten more years t = 1,2,...,10. Suppose that each year in which she remains unemployed, she may be offered a 'good' job that pays $100, or a 'bad' job that pays $44, with equal probability (50/50). Once she accepts a job, she will remain in that job for the rest of the ten years. (Assume for simplicity that she cares only about her monetary earnings, and that she values earnings at different times equally, i.e., the discount rate is zero.)

Should this person accept bad jobs? To answer this question, we can reason backwards from time t = 10.

At time 10, the value of accepting a good job is $100; the value of accepting a bad job is $44; the value of rejecting the job that is available is zero. Therefore, if she is still unemployed in the last period, she should accept whatever job she is offered at that time.
At time 9, the value of accepting a good job is $200 (because that job will last for two years); the value of accepting a bad job is 2*$44 = $88. The value of rejecting a job offer is $0 now, plus the value of waiting for the next job offer, which will either be $44 with 50% probability or $100 with 50% probability, for an average ('expected') value of 0.5*($100+$44) = $72. Therefore regardless of whether the job available at time 9 is good or bad, it is better to accept that offer than wait for a better one.
At time 8, the value of accepting a good job is $300 (it will last for three years); the value of accepting a bad job is 3*$44 = $132. The value of rejecting a job offer is $0 now, plus the value of waiting for a job offer at time 9. Since we have already concluded that offers at time 9 should be accepted, the expected value of waiting for a job offer at time 9 is 0.5*($200+$88) = $144. Therefore at time 8, it is more valuable to wait for the next offer than to accept a bad job.

It can be verified by continuing to work backwards that bad offers should only be accepted if one is still unemployed at times 9 or 10; they should be rejected at all times up to t = 8. The intuition is that if one expects to work in a job for a long time, this makes it more valuable to be picky about what job to accept.

A dynamic optimization problem of this kind is called an optimal stopping problem, because the issue at hand is when to stop waiting for a better offer. Search theory is the field of microeconomics that applies problems of this type to contexts like shopping, job search, and marriage.

An example of backward induction in game theory

Consider the ultimatum game, where one player proposes to split a dollar with another. The first player (the proposer) suggests a division of the dollar between the two players. The second player is then given the option to either accept the split or reject it. If the second player accepts, both get the amount suggested by the proposer. If rejected, neither receives anything.

Consider the actions of the second player given any arbitrary proposal by the first player (that gives the second player more than zero). Since the only choice the second player has at each of these points in the game is to choose between something and nothing, one can expect that the second will accept. Given that the second will accept all proposals offered by the first (that give the second anything at all), the first ought to propose giving the second as little as possible. This is the unique subgame perfect equilibrium of the Ultimatum Game. (However, the Ultimatum Game does have several other Nash equilibria which are not subgame perfect.)

Backward induction and economic entry

Consider a dynamic game in which the players are an incumbent firm in an industry and a potential entrant to that industry. As it stands, the incumbent has a monopoly over the industry and does not want to lose some of its market share to the entrant. If the entrant chooses not to enter, the payoff to the incumbent is high (it maintains its monopoly) and the entrant neither loses nor gains (its payoff is zero). If the entrant enters, the incumbent can "fight" or "accommodate" the entrant. It will fight by lowering its price, running the entrant out of business (and incurring exit costs — a negative payoff) and damaging its own profits. If it accommodates the entrant it will lose some of its sales, but a high price will be maintained and it will receive greater profits than by lowering its price (but lower than monopoly profits).

Say that, the best response of the incumbent is to accommodate if the entrant enters. If the incumbent accommodates, the best response of the entrant is to enter (and gain profit). Hence the strategy profile in which the incumbent accommodates if the entrant enters and the entrant enters if the incumbent accommodates is a Nash equilibrium. However, if the incumbent is going to play fight, the best response of the entrant is to not enter. If the entrant does not enter, it does not matter what the incumbent chooses to do (since there is no other firm to do it to — note that if the entrant does not enter, fight and accommodate yield the same payoffs to both players; the incumbent will not lower its prices if the entrant does not enter). Hence fight is a best response of the incumbent if the entrant does not enter. Hence the strategy profile in which the incumbent fights if the entrant does not enter and the entrant does not enter if the incumbent fights is a Nash equilibrium. Since the game is dynamic, any claim by the incumbent that it will fight is not a credible threat because by the time the decision node is reached where it can decide to fight (i.e. the entrant has entered), it would be irrational to do so. Therefore this Nash equilibrium can be eliminated by backward induction.

A paradox of backward induction

The unexpected hanging paradox is a paradox related to backward induction. Suppose a prisoner is told that she will be hanged sometime between Monday and Friday of next week. However, the exact day will be a surprise (i.e. she will not know the night before that she will be executed the next day). The prisoner, interested in outsmarting her executioner, attempts to determine which day the execution will occur.

She reasons that it cannot occur on Friday, since if it had not occurred by the end of Thursday, she would know the execution would be on Friday. Therefore she can eliminate Friday as a possibility. With Friday eliminated, she decides that it cannot occur on Thursday, since if it had not occurred on Wednesday, she would know that it had to be on Thursday. Therefore she can eliminate Thursday. This reasoning proceeds until she has eliminated all possibilities. She concludes that she will not be hanged next week.

To her surprise, she is hanged on Wednesday.

Here the prisoner reasons by backward induction, but seems to come to a false conclusion. Note, however, that the description of the problem assumes it is possible to surprise someone who is performing backward induction. The mathematical theory of backward induction does not make this assumption, so the paradox does not call into question the results of this theory. Nonetheless, this paradox has received some substantial discussion by philosophers.

Notes

^ Jerome Adda and Russell Cooper, "Dynamic Economics: Quantitative Methods and Applications", Section 3.2.1, page 28. MIT Press, 2003.
^ Mario Miranda and Paul Fackler, "Applied Computational Economics and Finance", Section 7.3.1, page 164. MIT Press, 2002.
^ Drew Fudenberg and Jean Tirole, "Game Theory", Section 3.5, page 92. MIT Press, 1991.
^ John von Neumann and Oskar Morgenstern, "Theory of Games and Economic Behavior", Section 15.3.1. Princeton University Press. Third edition, 1953. (First edition, 1944.)
^ Mathematics of Chess, webpage by John MacQuarrie.

Topics in game theory

Definitions	Normal-form game · Extensive-form game · Cooperative game · Succinct game · Information set · Preference

Equilibrium concepts	Nash equilibrium · Subgame perfection · Bayesian-Nash · Perfect Bayesian · Trembling hand · Proper equilibrium · Epsilon-equilibrium · Correlated equilibrium · Sequential equilibrium · Quasi-perfect equilibrium · Evolutionarily stable strategy · Risk dominance · Pareto efficiency · Quantal response equilibrium · Self-confirming equilibrium · Strong Nash equilibrium · Markov perfect equilibrium

Strategies	Dominant strategies · Pure strategy · Mixed strategy · Tit for tat · Grim trigger · Collusion · Backward induction · Markov strategy

Classes of games	Symmetric game · Perfect information · Simultaneous game · Sequential game · Repeated game · Signaling game · Cheap talk · Zero–sum game · Mechanism design · Bargaining problem · Stochastic game · Large poisson game · Nontransitive game · Global games

Games	Prisoner's dilemma · Traveler's dilemma · Coordination game · Chicken · Centipede game · Volunteer's dilemma · Dollar auction · Battle of the sexes · Stag hunt · Matching pennies · Ultimatum game · Rock-paper-scissors · Pirate game · Dictator game · Public goods game · Blotto games · War of attrition · El Farol Bar problem · Cake cutting · Cournot game · Deadlock · Diner's dilemma · Guess 2/3 of the average · Kuhn poker · Nash bargaining game · Screening game · Trust game · Princess and monster game · Monty Hall problem

Theorems	Minimax theorem · Nash's theorem · Purification theorem · Folk theorem · Revelation principle · Arrow's impossibility theorem

See also	Tragedy of the commons · Tyranny of small decisions · All-pay auction · List of games in game theory · Confrontation Analysis